Fix corpus test round-trip failures and achieve Rust escape_debug parity #143
Merged
philhassey merged 11 commits into cedar-policy:main from Mar 18, 2026
Test policy sets through cedar and JSON round-trips, and assert entity
map equality after JSON round-trip.
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change the Set field in nodeJSON from arrayJSON to *arrayJSON so that
omitempty distinguishes "no Set" from "empty Set", preventing empty
Cedar sets from marshaling as {} instead of {"Set":[]}.
In Record.UnmarshalJSON, return a zero-value Record when the input
map is empty, so that round-tripping through {"tags":{}} preserves
nil-map equality with entities that had no tags field originally.
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
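The omitempty behavior described in this commit can be reproduced in isolation. The following is a minimal sketch, not the library's code: `valueNode`, `pointerNode`, and `marshal` are hypothetical names, and `[]int` stands in for `arrayJSON`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueNode mirrors the old shape: a slice field with omitempty.
// encoding/json treats a zero-length slice as "empty" and omits it.
type valueNode struct {
	Set []int `json:"Set,omitempty"`
}

// pointerNode mirrors the new shape: a pointer to a slice with omitempty.
// Only a nil pointer counts as "empty"; a pointer to an empty slice is kept.
type pointerNode struct {
	Set *[]int `json:"Set,omitempty"`
}

// marshal is a small helper (hypothetical) that panics on error.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	empty := []int{}
	fmt.Println(marshal(valueNode{Set: empty}))    // {}
	fmt.Println(marshal(pointerNode{Set: &empty})) // {"Set":[]}
	fmt.Println(marshal(pointerNode{}))            // {}
}
```

With the pointer field, "no Set" (nil pointer) and "empty Set" (pointer to an empty slice) produce different JSON, which is the distinction the commit relies on.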
patjakdev
reviewed
Mar 18, 2026
In Pattern.MarshalJSON, always emit a Literal component when the
component is not a wildcard, even when the literal is "". Use
json.Marshal for the literal value to properly escape special
characters like null bytes.
In ParsePattern, when the input is an empty string, produce a single
empty-literal component so that "like \"\"" round-trips correctly
through JSON as [{"Literal":""}] instead of [].
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
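As a rough illustration of the marshaling rule above, the sketch below always emits a Literal object for non-wildcard components, including the empty literal. The `component` type and `marshalPattern` function are hypothetical stand-ins, and the bare "Wildcard" string form is an assumption about the JSON shape rather than something stated in this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// component is a hypothetical pattern component: either a wildcard
// or a literal string (which may be empty).
type component struct {
	wildcard bool
	literal  string
}

// marshalPattern emits one JSON element per component. Literals go
// through json.Marshal, so special characters such as null bytes are
// escaped by the encoder instead of being written raw.
func marshalPattern(cs []component) (string, error) {
	out := make([]any, 0, len(cs))
	for _, c := range cs {
		if c.wildcard {
			out = append(out, "Wildcard") // assumed wire form
			continue
		}
		// Emit the literal even when it is "".
		out = append(out, map[string]string{"Literal": c.literal})
	}
	b, err := json.Marshal(out)
	return string(b), err
}

func main() {
	s, _ := marshalPattern([]component{{literal: ""}})
	fmt.Println(s) // [{"Literal":""}]

	s, _ = marshalPattern([]component{{literal: "a\x00"}, {wildcard: true}})
	fmt.Println(s) // [{"Literal":"a\u0000"},"Wildcard"]
}
```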
EntityUID.MarshalCedar() was delegating to String() which uses
strconv.Quote, producing Go-style escapes (\b, \a, \f, \v) that are
not valid in Cedar. Switch to rust.EscapeString which produces
Cedar-compatible escapes (\u{8}, \u{7}, \u{c}, \u{b}).
Leave String() unchanged for Go-idiomatic display/debugging.
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pattern.MarshalCedar() was using strconv.Quote which produces
Go-style escapes (\x00, \v) that are not valid in Cedar. Switch
to rust.EscapeString for Cedar-compatible escapes (\0, \u{b}).
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
canMarshalAsIdent("") returned true because the loop over an empty
string iterates zero times. It also accepted reserved keywords like
"true", "false", "if", "in", etc., producing invalid Cedar like
`context.true` instead of `context["true"]`.
Add an early return for empty strings and reserved keywords, using
the existing IsReservedKeyword helper.
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
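The empty-string bug comes from a loop that vacuously succeeds on zero iterations. A minimal sketch of the guarded check follows; `identSketch`, `accessSketch`, and the `reserved` map are hypothetical, and the keyword list is an illustrative subset rather than the contents of `IsReservedKeyword`.

```go
package main

import "fmt"

// reserved is an illustrative subset of keywords (assumption).
var reserved = map[string]bool{
	"true": true, "false": true, "if": true, "then": true,
	"else": true, "in": true, "like": true, "has": true, "is": true,
}

// identSketch reports whether s can be written as a bare identifier.
// The early return covers the two cases the per-character loop misses:
// "" (zero iterations) and keywords (every character is valid).
func identSketch(s string) bool {
	if s == "" || reserved[s] {
		return false
	}
	for i, r := range s {
		isLetter := r == '_' || (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		isDigit := r >= '0' && r <= '9'
		if isLetter || (i > 0 && isDigit) {
			continue
		}
		return false
	}
	return true
}

// accessSketch renders an attribute access, falling back to index
// syntax when the key is not a valid identifier. String escaping of
// the key is omitted for brevity.
func accessSketch(base, key string) string {
	if identSketch(key) {
		return base + "." + key
	}
	return base + `["` + key + `"]`
}

func main() {
	fmt.Println(accessSketch("context", "name")) // context.name
	fmt.Println(accessSketch("context", "true")) // context["true"]
	fmt.Println(accessSketch("context", ""))     // context[""]
}
```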
Binary operators were not parenthesizing children correctly based on
associativity:
- Non-associative relation ops (==, !=, <, <=, >, >=, in) and keyword
  ops (has, like, is, is-in) need to parenthesize both operands at the
  same precedence, since (a == b) == c is valid but a == b == c is not.
- Left-associative ops (+, -, *, &&, ||) need to parenthesize their
  right operand at the same precedence, since (a - b) - (c - d) must not
  be flattened to a - b - c - d which changes semantics.
Split marshalInfixBinaryOp to take separate left/right precedence
levels. Left-associative ops pass (p, p+1) and non-associative ops
pass (p+1, p+1).
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
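The (p, p+1) versus (p+1, p+1) rule can be sketched with a tiny expression printer. Everything here is hypothetical (the `leaf` and `binary` types, the `ops` table, and its numeric precedence levels); it only illustrates the technique of passing separate minimum precedences to the left and right children.

```go
package main

import "fmt"

type expr interface{}

type leaf string

type binary struct {
	op          string
	left, right expr
}

type opInfo struct {
	prec      int
	leftAssoc bool
}

// ops uses assumed precedence numbers; only their relative order matters.
var ops = map[string]opInfo{
	"||": {1, true}, "&&": {2, true},
	"==": {3, false}, "<": {3, false},
	"+": {4, true}, "-": {4, true}, "*": {5, true},
}

// render prints e, adding parentheses when e binds looser than minPrec.
func render(e expr, minPrec int) string {
	switch n := e.(type) {
	case leaf:
		return string(n)
	case binary:
		p := ops[n.op].prec
		leftMin, rightMin := p+1, p+1 // non-associative: (p+1, p+1)
		if ops[n.op].leftAssoc {
			leftMin = p // left-associative: (p, p+1)
		}
		s := render(n.left, leftMin) + " " + n.op + " " + render(n.right, rightMin)
		if p < minPrec {
			return "(" + s + ")"
		}
		return s
	}
	panic("unknown expression type")
}

func main() {
	a, b, c, d := leaf("a"), leaf("b"), leaf("c"), leaf("d")
	sub := binary{"-", binary{"-", a, b}, binary{"-", c, d}}
	fmt.Println(render(sub, 0)) // a - b - (c - d)

	eq := binary{"==", binary{"==", a, b}, c}
	fmt.Println(render(eq, 0)) // (a == b) == c
}
```

The left operand of a left-associative operator needs no parentheses at equal precedence because the parser would group it that way anyway; the right operand does.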
Two issues with Record in nodeJSON:
1. Record field was a value type (recordJSON = map[string]nodeJSON)
with omitempty, so empty records were silently dropped during
JSON marshal, producing {} instead of {"Record":{}}.
Fix: change to *recordJSON pointer (same pattern as the Set fix).
2. recordJSON mapped keys to nodeJSON values (not pointers). Since
nodeJSON.MarshalJSON has a pointer receiver, Go's json.Marshal
cannot call it on map values (can't take address of map element).
This caused the default marshaler to be used, which skips the
ExtensionCall field (tagged json:"-"), silently dropping extension
calls like duration() and datetime() inside records.
Fix: change recordJSON to map[string]*nodeJSON so the custom
MarshalJSON is always invoked.
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the approximate EscapeString/ShouldEscape implementation with
an exact port of Rust's char::escape_debug() and str::escape_debug():
- Port Rust's is_printable() lookup tables from printable.rs verbatim,
  replacing Go's unicode.IsPrint() which disagrees at boundaries like
  U+00A0, U+00AD, U+FFFC, and U+FFFD.
- Port Rust's Grapheme_Extend property as Mn + Me +
  Other_Grapheme_Extend, replacing the incorrect Mn + Me + Mc check.
  Mc (spacing combining marks like U+0903) are NOT in Grapheme_Extend;
  Other_Grapheme_Extend chars like U+FF9E and U+FF9F now correctly
  detected.
- Implement str::escape_debug first/continuation distinction: grapheme
  extend chars are escaped as first char (ESCAPE_ALL) but passed through
  raw in continuation positions (CharEscapeDebugContinue).
- Add EscapeCharAll for Pattern marshaling, matching Rust Cedar's
  per-character char::escape_debug() with ESCAPE_ALL on every char.
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
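The first/continuation distinction can be illustrated with a reduced sketch that handles only the grapheme-extend rule and passes every other character through unchanged. The function names are hypothetical, and `isGraphemeExtendApprox` uses Go's own Unicode tables as an approximation; the PR ports Rust's tables instead, so results may differ for characters where the two disagree.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isGraphemeExtendApprox approximates Grapheme_Extend as
// Mn + Me + Other_Grapheme_Extend using Go's unicode package.
func isGraphemeExtendApprox(r rune) bool {
	return unicode.In(r, unicode.Mn, unicode.Me, unicode.Other_Grapheme_Extend)
}

// escapeStrSketch mimics the string rule: a grapheme-extend character
// is escaped only when it is the first character of the string.
func escapeStrSketch(s string) string {
	var b strings.Builder
	first := true
	for _, r := range s {
		if first && isGraphemeExtendApprox(r) {
			fmt.Fprintf(&b, `\u{%x}`, r)
		} else {
			b.WriteRune(r)
		}
		first = false
	}
	return b.String()
}

// escapeCharAllSketch mimics the per-character rule used for patterns:
// every grapheme-extend character is escaped, regardless of position.
func escapeCharAllSketch(s string) string {
	var b strings.Builder
	for _, r := range s {
		if isGraphemeExtendApprox(r) {
			fmt.Fprintf(&b, `\u{%x}`, r)
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	in := "\u0301a\u0301" // combining acute, 'a', combining acute
	fmt.Printf("%q\n", escapeStrSketch(in))     // leading mark escaped, trailing mark raw
	fmt.Printf("%q\n", escapeCharAllSketch(in)) // both marks escaped
}
```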
Validate escapeRune, EscapeString, EscapeCharAll, isPrintable, and
isGraphemeExtended against exact Rust 1.93 char::escape_debug() and
str::escape_debug() output for 160+ codepoints covering:
- All C0/C1 controls, full Latin-1 supplement boundary (0xA0-0xBF)
- Combining marks (Mn, Me, Mc), format chars, separators, BOM
- Halfwidth katakana, hangul jungseong, tag chars, variation selectors
- SMP and supplementary plane boundaries
- str first-vs-continuation grapheme extend distinction
- Round-trip correctness through Unquote for both escape modes
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each corpus generation target now uses a private mktemp -d with a trap
for cleanup on failure, eliminating shared /tmp/corpus-tests collisions
and leaked temp files. Quote basename arguments for safety with special
characters. Add extracted corpus directories to .gitignore.
Signed-off-by: Phil Hassey <phil@strongdm.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
patjakdev
approved these changes
Mar 18, 2026
Summary
This PR fixes all ~410 corpus round-trip test failures (Cedar text and JSON) and achieves 1:1 parity with Rust's char::escape_debug()/str::escape_debug() for string escaping.

Round-trip test infrastructure

Cedar text escaping (EntityUID, String, Pattern)
- EntityUID: MarshalCedar() now uses rust.EscapeString instead of strconv.Quote, fixing invalid Go-style escapes (\b, \a, \f, \v) that Cedar's parser rejects
- Pattern: MarshalCedar() now uses rust.EscapeCharAll instead of strconv.Quote, fixing the same class of invalid escapes (had a TODO acknowledging this)

Cedar text operator marshaling
- canMarshalAsIdent() now rejects empty strings and reserved keywords (true, false, if, in, etc.), preventing invalid Cedar like `context.true` instead of `context["true"]`
- Non-associative ops (==, !=, <, <=, >, >=, in, has, like, is) now parenthesize both operands at same precedence
- Left-associative ops (+, -, *, &&, ||) now parenthesize their right operand at same precedence, preventing (a - b) - (c - d) from flattening to a - b - c - d

JSON round-trip fixes
- Changed the Set field from arrayJSON to *arrayJSON so omitempty distinguishes "no Set" from "empty Set"
- Changed the Record field from recordJSON to *recordJSON (same pattern as Set fix)
- Changed recordJSON from map[string]nodeJSON to map[string]*nodeJSON. Since nodeJSON.MarshalJSON() has a pointer receiver, Go's json.Marshal couldn't call it on map values, silently dropping extension calls (like duration(), datetime()) stored in the json:"-" field
- NewRecord(nil) and NewRecord(RecordMap{}) serialize distinctly

Rust escape_debug 1:1 parity
- Ported Rust's is_printable() lookup tables from library/core/src/unicode/printable.rs verbatim, replacing Go's unicode.IsPrint() which disagrees at boundaries (U+00A0, U+00AD, U+FFFC, U+FFFD, etc.)
- Ported Grapheme_Extend detection as Mn + Me + Other_Grapheme_Extend, replacing incorrect Mn + Me + Mc. Spacing combining marks (Mc) like U+0903 are NOT in Grapheme_Extend; Other_Grapheme_Extend chars like U+FF9E/U+FF9F are now correctly detected
- Implemented the str::escape_debug() first/continuation distinction: grapheme extend chars are escaped as first char but passed through raw in continuation positions
- Added EscapeCharAll() for Pattern marshaling, matching Rust Cedar's per-character char::escape_debug() with ESCAPE_ALL

Test coverage
- make test and make linters pass clean

Test plan
- make test: all packages pass at 100% coverage
- make linters: golangci-lint clean, no insufficient coverage

🤖 Generated with Claude Code